PQTable: Non-exhaustive Fast Search for Product-quantized Codes using Hash Tables
Abstract
In this paper, we propose a product quantization table (PQTable), a fast search method for product-quantized codes via hash tables. An identifier of each database vector is associated with the slot of a hash table by using its PQ code as a key. For querying, an input vector is PQ-encoded and hashed, and the items associated with that code are then retrieved. The proposed PQTable produces the same results as a linear PQ scan, and is 10^2 to 10^5 times faster. Although state-of-the-art performance can be achieved by previous inverted-indexing-based approaches, such methods require manually designed parameter settings and significant training; our PQTable is free of these limitations, and therefore offers a practical and effective solution for real-world problems. Specifically, when the vectors are highly compressed, our PQTable achieves one of the fastest search performances on a single CPU to date with highly efficient memory usage (0.059 ms per query over 10^9 data points with just 5.5 GB memory consumption). Finally, we show that our proposed PQTable can naturally handle the codes of an optimized product quantization (OPQTable).
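The core idea in the abstract (PQ code as hash key, identifiers stored in the matching slot, query encoded and looked up) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy codebooks, the function names `encode`, `build_table` and `query`, and the use of a plain dictionary as the hash table are all assumptions made for illustration. The sketch only looks up the bucket whose code exactly matches the query's code; it makes no attempt to search neighbouring codes or to reproduce the results of a full PQ scan.

```python
from collections import defaultdict


def encode(vec, codebooks):
    """PQ-encode vec: split it into len(codebooks) equal sub-vectors and
    return, for each sub-vector, the index of its nearest codeword."""
    d_sub = len(vec) // len(codebooks)
    code = []
    for i, book in enumerate(codebooks):
        sub = vec[i * d_sub:(i + 1) * d_sub]
        # Squared Euclidean distance from the sub-vector to every codeword.
        dists = [sum((a - b) ** 2 for a, b in zip(sub, word)) for word in book]
        code.append(dists.index(min(dists)))
    return tuple(code)


def build_table(database, codebooks):
    """Store each vector's identifier in the slot keyed by its PQ code."""
    table = defaultdict(list)
    for ident, vec in enumerate(database):
        table[encode(vec, codebooks)].append(ident)
    return table


def query(table, vec, codebooks):
    """PQ-encode the query and return the identifiers stored under that
    exact code (an empty list if the slot is empty)."""
    return table.get(encode(vec, codebooks), [])


# Hypothetical toy setup: 4-D vectors, 2 sub-spaces, 2 codewords each.
CODEBOOKS = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.0, 0.0), (1.0, 1.0)],
]
DATABASE = [
    (0.1, 0.0, 0.9, 1.0),
    (0.9, 1.1, 0.1, 0.0),
    (0.0, 0.2, 1.1, 0.8),
]
TABLE = build_table(DATABASE, CODEBOOKS)

print(query(TABLE, (0.05, 0.05, 0.95, 0.95), CODEBOOKS))
```

In this toy setup, vectors 0 and 2 share a PQ code, so a query near them retrieves both identifiers with a single dictionary lookup instead of a scan over the whole database.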
Similar resources
Cosine Similarity Search with Multi Index Hashing
Due to the rapid development of the Internet, recent years have witnessed an explosion in the rate of data generation. Dealing with data at current scales brings up unprecedented challenges. From the algorithmic viewpoint, executing existing linear algorithms in information retrieval and machine learning on such tremendous amounts of data incurs intolerable computational and storage costs. To addre...
Hashing with dual complementary projection learning for fast image retrieval
Due to explosive growth of visual content on the web, there is an emerging need of fast similarity search to efficiently exploit such enormous web contents from very large databases. Recently, hashing has become very popular for efficient nearest neighbor search in large scale applications. However, many traditional hashing methods learn the binary codes in a single shot or only employ a single...
Approximate Nearest Neighbor Search by Residual Vector Quantization
A recently proposed product quantization method is efficient for large scale approximate nearest neighbor search, however, its performance on unstructured vectors is limited. This paper introduces residual vector quantization based approaches that are appropriate for unstructured vectors. Database vectors are quantized by residual vector quantizer. The reproductions are represented by short cod...
Mixed-Resolution Patch-Matching
Matching patches of a source image with patches of itself or a target image is a first step for many operations. Finding the optimum nearest-neighbors of each patch using a global search of the image is expensive. Optimality is often sacrificed for speed as a result. We present the Mixed-Resolution Patch-Matching (MRPM) algorithm that uses a pyramid representation to perform fast global search....
Learning Binary Hash Codes for Large-Scale Image Search
Algorithms to rapidly search massive image or video collections are critical for many vision applications, including visual search, content-based retrieval, and non-parametric models for object recognition. Recent work shows that learned binary projections are a powerful way to index large collections according to their content. The basic idea is to formulate the projections so as to approximat...
Journal title:
- CoRR
Volume abs/1704.06556
Pages -
Publication date 2017